Vocabulary-Based Language Similarity using Web Corpora

نویسندگان

  • Dirk Goldhahn
  • Uwe Quasthoff
چکیده

This paper will focus on automatic methods for quantifying language similarity. This is achieved by ascribing language similarity to the similarity of text corpora. This corpus similarity will first be determined by the resemblance of the vocabulary of languages. Thereto words or parts of them such as letter n-grams are examined. Extensions like transliteration of the text data will ensure the independence of the methods from text characteristics such as the writing system used. Further analyzes will show to what extent knowledge about the distribution of words in parallel text can be used in the context of language similarity.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Comparable Corpora in Cross-Language Information Retrieval

Cross-language information retrieval (CLIR) enables users to express queries in a language different from the language of the documents to be retrieved. For example, a Finnish-speaking person could pose a query to a CLIR system in Finnish (the source language) to retrieve documents written in English (the target language). The language barrier is usually crossed by translating the query into th...

متن کامل

Impact of Using Web-quests on Learning Vocabulary by Iranian Pre-university Students

Web-quests are internet-based technology applications in which groups of students follow a specific set of steps toward the completion of a final project on a specific subject or a multi-disciplinary subject. The present study aimed to investigate the impacts of using web-quests on learning vocabulary by Iranian pre-university students. The sample of the study consisted of 72 students assigned ...

متن کامل

Impact of Using Web-quests on Learning Vocabulary by Iranian Pre-university Students

Web-quests are internet-based technology applications in which groups of students follow a specific set of steps toward the completion of a final project on a specific subject or a multi-disciplinary subject. The present study aimed to investigate the impacts of using web-quests on learning vocabulary by Iranian pre-university students. The sample of the study consisted of 72 students assigned ...

متن کامل

A Comparison of ESLE Web-based English Vocabulary Learning Application with Traditional Desktop English Vocabulary Learning Application: Exceptional learner parents’ point of view

The aim of this study was to compare the Exceptional Student Learning English (ESLE) web application and traditional application and the evaluation of the ESLE app mainly from the exceptional student parents' perspective. To this end, five exceptional student parents with their exceptional children were selected among 30 parents in Isfahan in Isfahan province. Open-ended questionnaires were sen...

متن کامل

Building parallel corpora by automatic title alignment using length-based and text-based approaches

Cross-lingual semantic interoperability has drawn significant attention in recent digital library and World Wide Web research as the information in languages other than English has grown exponentially. Cross-lingual information retrieval (CLIR) across different European languages, such as English, Spanish, and French, has been widely explored; however, CLIR across European languages and Orienta...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014